University of Texas at Austin

Upcoming Event: Oden Institute Seminar

Robust Machine Learning for Biomedical Data: Efficiency, Reliability, and Generalizability

Assistant Professor Chenyu You, Stony Brook University

11 – 12PM
Tuesday Feb 17, 2026

POB 6.304

Abstract

In the rapidly growing area of machine learning, there is profound promise in crafting intelligent, data-driven methods for diverse real-world applications. Yet in safety-critical domains like healthcare, some fundamental challenges remain: (1) The scarcity of raw biomedical data calls for data-efficient and robust learning approaches. (2) The imperative of safety and stability necessitates a cohesive framework that unifies learning with theoretical guarantees. (3) The inherent heterogeneity and distribution shifts in real-world clinical data call for robust and generalizable learning methods. To address these challenges, I have explored several major directions:

(i) (Robust) Machine Learning for Imperfect Medical Data: Developing machine learning models typically requires collecting substantial annotated medical data, a requirement that is hard to meet when labels are scarce. Moreover, medical data often display a long-tailed class distribution, which results in notable class imbalance. There is therefore growing interest in training machine learning models jointly under imbalanced class distributions and limited annotations. I have developed novel, efficient, statistically consistent algorithms to improve empirical performance for biomedical image analysis.

(ii) Learning with Theoretical Guarantees: As machine learning methods have become ubiquitous in clinical decision-making, their reliability and interpretability have become increasingly important. This is particularly crucial in biomedical image analysis, where decision outcomes can have profound implications. I have developed novel machine learning algorithms that enable anatomical modeling with provable accuracy guarantees.

(iii) Generalizing across Diverse Biomedical Data: The development of medical foundation models often requires massive and diverse biomedical data. To this end, I have developed various foundation models for biomedical imaging data and explored novel applications of these models. I have also developed novel medical AI agents that enable scalable and accurate predictive modeling, particularly under distribution shift.

Biography

Chenyu You is an Assistant Professor of Data Science in the Department of Applied Mathematics & Statistics and the Department of Computer Science at Stony Brook University. He is also affiliated with the CVLab, the AI Institute, and the Institute for Advanced Computational Science (IACS). He works on AI for health, often with a focus on generalization and on making machine learning more reliable. He received his Ph.D. in 2024 from Yale University under the advisement of James S. Duncan, his M.S. in 2019 from Stanford University under the advisement of Daniel Rubin, and his B.S. in 2017 from Rensselaer Polytechnic Institute under the advisement of Ge Wang, all in electrical engineering. He has also spent time at Facebook AI Research (FAIR) and Google Research. He serves on the Medical Image Computing and Computer-Assisted Intervention Society (MICCAI) and the SUNY AI Symposium Planning Committee, and as an associate editor for IEEE Transactions on Medical Imaging, Medical Image Analysis, IEEE Transactions on Neural Networks and Learning Systems, Pattern Recognition, and Transactions on Machine Learning Research. He has received the AAAI'26 New Faculty Highlights, the MICCAI 2025 NIH Registration Grant Award, and the Yale George P. O’Leary Graduate Fellowship, and has been ranked among the World's Top 2% most-cited scientists by Stanford University since 2024. He is a member of the Sigma Xi scientific research society and received the Excellence in Teaching Award for Spring 2025. For more information, please visit his website: https://chenyuyou.me/.

Event information

Hosted by Ufuk Topcu